Parallel Classification on SMP Systems
Abstract
This paper presents fast, scalable decision-tree-based classification algorithms targeting shared-memory systems. The algorithms are based on the sequential SPRINT classifier and span the gamut of data and task parallelism. The data-parallel approach schedules attributes among processors. It is extended with task pipelining and dynamic load balancing to yield more efficient schemes. The task-parallel approach uses dynamic subtree partitioning among processors. These schemes are disk-based and achieve excellent speedup, making them well suited for data mining in very large databases.
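As a rough illustration of attribute scheduling with dynamic load balancing, the sketch below lets worker threads pull attribute indices from a shared queue and evaluate the best binary split for each one. It is a minimal sketch under stated assumptions, not the paper's implementation: the names `gini`, `best_split`, and `schedule_attributes` are hypothetical, a Gini-style split score is assumed, data is held in memory rather than on disk, and Python threads only illustrate the scheduling pattern rather than real CPU speedup.

```python
import threading
from queue import Queue, Empty


def gini(counts):
    """Gini impurity of a class-count dictionary (assumed split criterion)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (weighted_gini, threshold) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    right = {}
    for _, y in pairs:
        right[y] = right.get(y, 0) + 1
    left = {}
    best = (float("inf"), None)
    for i in range(n - 1):
        v, y = pairs[i]
        # Move record i from the right partition to the left partition.
        left[y] = left.get(y, 0) + 1
        right[y] -= 1
        if pairs[i + 1][0] == v:
            continue  # cannot split between equal attribute values
        score = ((i + 1) * gini(left) + (n - i - 1) * gini(right)) / n
        if score < best[0]:
            best = (score, (v + pairs[i + 1][0]) / 2)
    return best


def schedule_attributes(columns, labels, num_workers=4):
    """Evaluate every attribute; idle workers grab the next unprocessed one."""
    tasks = Queue()
    for idx in range(len(columns)):
        tasks.put(idx)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                idx = tasks.get_nowait()  # dynamic scheduling: take next free attribute
            except Empty:
                return
            outcome = best_split(columns[idx], labels)
            with lock:
                results[idx] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    best_attr = min(results, key=lambda i: results[i][0])
    return best_attr, results[best_attr]
```

Because workers claim attributes one at a time instead of receiving a fixed share up front, a worker that finishes a cheap attribute early simply takes the next one, which is the load-balancing idea in its simplest form.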
Related resources
Parallel Classification for Data Mining on Shared-Memory Multiprocessors
We present parallel algorithms for building decision-tree classifiers on shared-memory multiprocessor (SMP) systems. The proposed algorithms span the gamut of data and task parallelism. The data parallelism is based on attribute scheduling among processors. This basic scheme is extended with task pipelining and dynamic load balancing to yield faster implementations. The task parallel approach u...
A New Prediction Oriented Barrier Synchronization on SMP Clusters
Clusters of Symmetric Multiprocessors (CSMP) are becoming an increasingly popular high-performance computing platform due to the commodity availability of multiprocessor nodes, mature SMP operating systems, low-latency, high-bandwidth data networks, and superior price-performance ratio. Fast synchronization is crucial to making efficient use of SMP clusters. In this paper, we focus on one kind o...
Scalable Data Mining for Rules
Data Mining is the process of automatic extraction of novel, useful, and understandable patterns in very large databases. High-performance scalable and parallel computing is crucial for ensuring system scalability and interactivity as datasets grow inexorably in size and complexity. This thesis deals with both the algorithmic and systems aspects of scalable and parallel data mining algorithms a...
A Taxonomy of Programming Models for Symmetric Multiprocessors and SMP Clusters
The basic processing element, from PCs to large systems, is rapidly becoming a symmetric multiprocessor (SMP). As a result, the nodes of a parallel computer will often be an SMP. The resulting mixed hardware models (combining shared-memory and distributed memory) provide a challenge to system software developers to provide users with programming models that are portable, understandable, and eff...
An SMP soft classification algorithm for remote sensing
This work introduces a symmetric multiprocessing (SMP) version of the continuous iterative guided spectral class rejection (CIGSCR) algorithm, a semiautomated classification algorithm for remote sensing (multispectral) images. The algorithm uses soft data clusters to produce a soft classification containing inherently more information than a comparable hard classification at an increased comput...
Publication date: 1998